75 research outputs found

    On clustering procedures and nonparametric mixture estimation

    This paper deals with nonparametric estimation of conditional densities in mixture models when additional covariates are available. The proposed approach consists of running a preliminary clustering algorithm on the additional covariates to guess the mixture component of each observation. The conditional densities of the mixture model are then estimated using kernel density estimates applied separately to each cluster. We investigate the expected L1-error of the resulting estimates and derive optimal rates of convergence over classical nonparametric density classes, provided the clustering method is accurate. The performance of clustering algorithms is measured by the maximal misclassification error. We obtain upper bounds on this quantity for a single linkage hierarchical clustering algorithm. Lastly, applications of the proposed method to mixture models involving electricity distribution data and simulated data are presented.

    Statistical analysis of k-nearest neighbor collaborative recommendation

    Collaborative recommendation is an information-filtering technique that attempts to present information items that are likely to be of interest to an Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists which would allow us to precisely describe the mathematical forces driving collaborative filtering. To provide an initial contribution to this, we propose to set out a general sequential stochastic model for collaborative recommendation. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering, and analyze its asymptotic performance as the number of users grows. We establish consistency of the procedure under mild assumptions on the model. Rates of convergence and examples are also provided. Comment: Published at http://dx.doi.org/10.1214/09-AOS759 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Silicon nanowires as negative electrode for lithium-ion microbatteries

    The increasing demand for secondary batteries with higher specific energy densities requires the replacement of current electrode materials. With a very high theoretical capacity (4200 mAh g−1) at low voltage, silicon is presented as a very interesting potential candidate as negative electrode for lithium-ion microbatteries. For the first time, the electrochemical lithium alloying/de-alloying process is proven to occur, respectively, at 0.15 V/0.45 V vs. Li+/Li with Si nanowires (SiNWs, 200-300 nm in diameter) synthesized by chemical vapour deposition. This new three-dimensional architecture material is well suited to accommodate the expected large volume expansion due to the reversible formation of Li-Si alloys. At present, stable capacity over ten to twenty cycles is demonstrated. The storage capacity is shown to increase with the growth temperature by a factor of 3 as the temperature varies from 525 to 575 °C. These results, showing an attractive working potential and large storage capacities, open up a promising new field of research.

    Functional supervised classification with wavelets


    Optimal bandwidth selection for variable kernel density estimates

    It is well established that one can improve the performance of kernel density estimates by varying the bandwidth with the location and/or the sample data at hand. Our interest in this paper is in the data-based selection of a variable bandwidth within an appropriate parameterized class of functions. We present an automatic selection procedure inspired by the combinatorial tools developed in Devroye and Lugosi (2001). It is shown that the expected L1 error of the corresponding selected estimate is at most a given constant multiple of the best possible error plus an additive term which tends to zero under mild assumptions.

    Nonparametric Forecasting of the Manufacturing Output Growth with Firm-level Survey Data

    A large majority of summary indicators derived from the individual responses to qualitative Business Tendency Surveys (which are mostly three-modality questions) result from standard aggregation and quantification methods. This is typically the case for the indicators called balances of opinion, which are currently used in short-term analysis and considered by forecasters as explanatory variables in many models. In the present paper, we discuss a new statistical approach to forecast manufacturing growth from firm-survey responses. We base our predictions on a forecasting algorithm inspired by the random forest regression method, which is known to enjoy good prediction properties. Our algorithm exploits the heterogeneity of the survey responses, works fast, is robust to noise and allows for the treatment of missing values. In a real application to a French dataset related to the manufacturing sector, this procedure appears competitive with traditional algorithms. Keywords: Business Tendency Surveys, balance of opinion, short-term forecasting, manufactured production, k-nearest neighbor regression, random forecasts

    Optimal L1 bandwidth selection for variable kernel density estimates

    It is well established that one can improve the performance of kernel density estimates by varying the bandwidth with the location and/or the sample data at hand. Our interest in this paper is in the data-based selection of a variable bandwidth within an appropriate parameterized class of functions. We present an automatic selection procedure inspired by the combinatorial tools developed in Devroye and Lugosi [2001. Combinatorial Methods in Density Estimation. Springer, New York]. It is shown that the expected L1 error of the corresponding selected estimate is at most a given constant multiple of the best possible error plus an additive term which tends to zero under mild assumptions. Keywords: Variable kernel estimate, Nonparametric estimation, Partition, Shatter coefficient